# Low-latency speech synthesis

Llama3.1 Typhoon2 Audio 8b Instruct

Typhoon 2-Audio Edition is an end-to-end speech-to-speech model that accepts audio, speech, and text inputs and generates text and speech outputs simultaneously. It is optimized for Thai and also supports English.

Tags: Transformers, Supports Multiple Languages

Mms Spa Finetuned Colombian Monospeaker

A Spanish text-to-speech (TTS) model fine-tuned from MMS, which uses the VITS architecture. Fine-tuning requires only 80 to 150 samples and about 20 minutes of training to produce Spanish speech with a Colombian accent.

Tags: Speech Synthesis, Transformers, Spanish

Mms Spa Finetuned Argentinian Monospeaker

A Spanish speech synthesis model fine-tuned from the MMS Spanish checkpoint, which uses the VITS architecture. It was trained on only 80 to 150 samples in approximately 20 minutes.

Tags: Speech Synthesis, Transformers, Spanish
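
Since the two Spanish models above are listed as MMS/VITS checkpoints usable with Transformers, the following minimal sketch shows how such a checkpoint might be loaded and its output saved. It assumes the checkpoint is compatible with the `VitsModel` and `AutoTokenizer` classes from the Hugging Face `transformers` library, which this page does not confirm. The `model_id` parameter is a placeholder for the checkpoint's repository ID, which is not given here, and the helper names `to_pcm16`, `save_wav`, and `synthesize` are illustrative only.

```python
import struct
import wave


def to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to 16-bit signed integers, clipping outliers."""
    return [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]


def save_wav(path, samples, sampling_rate):
    """Write mono 16-bit PCM audio to a WAV file using only the standard library."""
    pcm = to_pcm16(samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)             # single speaker, mono output
        wav_file.setsampwidth(2)             # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


def synthesize(text, model_id):
    """Generate speech for `text`; returns (list of float samples, sampling rate).

    Assumption: `model_id` names a checkpoint that loads with VitsModel.
    """
    # Imported lazily so the helpers above work without torch/transformers installed.
    import torch
    from transformers import AutoTokenizer, VitsModel

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = VitsModel.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():                    # inference only, no gradients needed
        waveform = model(**inputs).waveform[0]
    return waveform.tolist(), model.config.sampling_rate
```

With a real repository ID in hand, usage would look like `samples, rate = synthesize("Hola, ¿cómo estás?", model_id)` followed by `save_wav("salida.wav", samples, rate)`. The model is loaded on every call to keep the sketch short; a longer-running application would load it once and reuse it.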
